Programming Basics

Programming Basics: Constructs

This section introduces the basic concepts of computer programming. These concepts apply to procedural programming of a binary computer in general, and are independent of any particular language or computer. The reader will learn the basic programming constructs that instruct a computer to operate on data. How data (numbers and characters) are stored in a computer was covered in the previous section, Programming Basics: Data.

Basic Programming Constructs

Common to most programming languages, particularly procedural languages that run on a sequential instruction execution computer, are a few basic programming constructs: variable assignment (which encompasses variable types and expressions), conditional testing, branching and looping, input and output, and subroutines.

Variable Assignment

One of the most fundamental requirements of a programming language is the ability to assign values to variables. Variables are symbolic names that refer to a storage location where the value of the variable is stored. It is called a variable in that the value that it contains can be changed while the program is executing. To help programmers write robust code, variables are assigned types so that the value in the variable is interpreted in the correct fashion. Variables are assigned values by the use of expressions. Both of these topics are covered next.

Variable Types

The types of values that can be contained in a variable by and large belong to a few basic groups. A value can represent either a logical, numeric or character value. The numeric value is further distinguished between an integer and a floating-point value. Each of the variable types can be grouped into what is called an array or table. An array is a collection of values, all of the same type, that represent multiple instances of a variable. A special case is a constant, which is a type of variable whose value never changes. Categorizing a constant as a variable type is a misnomer, but for convenience's sake it is defined this way.

Logical

Logical variables contain Boolean values. A Boolean value indicates whether something is true or false. Therefore a variable that is a logical type can only contain one of two values, either true or false. In the expression section we will see how multiple logical variables can be operated upon to produce a new true or false value.

Examples:
Boolean logicalvar; //declaring the variable named logicalvar to be of type Boolean
logicalvar := 1; //a variable assignment, typically 1 means true
logicalvar := 0; //a variable assignment, typically 0 means false

More variable assignments: some languages have keywords for true and false, and the actual values that encode true and false are hidden from the programmer.
logicalvar := true;
logicalvar := false;

Numeric

Numeric variables contain values that represent numbers. These numbers can either be integer or floating-point values. Integer values are whole numbers; they do not have a decimal point. Floating-point values do have a decimal point and thus can represent fractional parts of a number like 23.341. Each of these two basic types of numbers can use different amounts of memory to contain their values. The amount of memory that is reserved for the numbers will affect the range of values the number can accurately represent. Recall from the Data Representation Size Units section that the number of bits determines all the possible values that are contained within a size unit.

Integer
Integer values are typically defined to use different amounts of memory. The motivation for this is to conserve memory. If the range of possible values that the integer variable is to contain can be stored in a smaller amount of memory, then use the smaller type so that memory is not wasted. Typical integer types are Byte, Short, Int and Long.

Byte
The definition of Byte is of course eight (8) bits and can contain 256 different values; either 0 to 255 or –128 to 127 if the integer is signed.

Short
The definition of Short depends on the language, compiler or computer the program is used on. However, the typical definition is that Short is sixteen (16) bits. The range of values in 16 bits is 0 to 65535 or –32768 to 32767 if the integer is signed.

Int
The definition of Int (integer) also depends on the language, compiler or computer. The typical definition is either sixteen (16) bits like Short or 32 bits. The range of values in 32 bits is 0 to 4,294,967,295 or –2,147,483,648 to 2,147,483,647 if the integer is signed.

Long
The definition of Long is similarly ambiguous, like Short and Int. The typical definition is either 32 or 64 bits. The range of values in 64 bits is 0 to 18,446,744,073,709,551,615 or –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 if the integer is signed.
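The ranges quoted above follow directly from the number of bits. The short Python sketch below recomputes them. Python is used only for illustration; the function name integer_ranges is made up for this example, and the signed ranges assume the common two's complement encoding.

```python
def integer_ranges(bits):
    """Return ((unsigned_min, unsigned_max), (signed_min, signed_max)) for a bit width."""
    unsigned = (0, 2**bits - 1)
    # Assumption: two's complement, which gives one more negative value than positive.
    signed = (-(2**(bits - 1)), 2**(bits - 1) - 1)
    return unsigned, signed


for name, bits in [("Byte", 8), ("Short", 16), ("Int", 32), ("Long", 64)]:
    unsigned, signed = integer_ranges(bits)
    print(f"{name:5} {bits:2} bits: unsigned {unsigned[0]} to {unsigned[1]:,}; "
          f"signed {signed[0]:,} to {signed[1]:,}")
```

Each additional bit doubles the number of values a type can hold, which is why the ranges grow so quickly from Byte to Long.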

Floating Point
Floating-point values are typically defined to use different amounts of memory. The motivation for this is to conserve memory. If the range of possible values that the numeric floating-point variable is to contain can be stored in that amount of memory, then use that type so that memory is not wasted. Because floating-point numbers record the value to the left and the right of the decimal point, providing greater precision for very small numbers (close to zero) also motivates the different sizes. Typical floating-point types are Float (single) or Double.

Float (single)
Float, or single (so named because of the single precision of the floating-point number), is typically defined to be 32 bits. The range of values is determined by the scheme used to encode the floating-point number. Floating-point numbers are inherently signed, so the range is always a plus/minus (+/-) range. Under the IEEE 754 floating-point standard, nonzero magnitudes range from +/-1.4E-45 to +/-3.4028235E+38. This range is in scientific notation.

Double
Double, more aptly named because it provides roughly double the precision of float (single), is typically defined to be 64 bits. Again, the range of values is determined by the scheme used to encode the floating-point number. Floating-point numbers are inherently signed, so the range is always a plus/minus (+/-) range. Under the IEEE 754 floating-point standard, nonzero magnitudes range from +/-4.9E-324 to +/-1.7976931348623157E+308. This range is in scientific notation.
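The limits quoted for Float and Double can be checked by building the extreme IEEE 754 bit patterns and reading them back as numbers. The following is a minimal Python sketch; the helper name value_from_bits is an assumption of this example, and it relies on Python's standard struct module to reinterpret the bits.

```python
import struct


def value_from_bits(bits, width):
    """Reinterpret an integer bit pattern as a 32-bit or 64-bit IEEE 754 value."""
    if width == 32:
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# Smallest positive values: only the lowest bit is set.
smallest_single = value_from_bits(0x00000001, 32)
smallest_double = value_from_bits(0x0000000000000001, 64)

# Largest finite values: the largest exponent that is not reserved, all fraction bits set.
largest_single = value_from_bits(0x7F7FFFFF, 32)
largest_double = value_from_bits(0x7FEFFFFFFFFFFFFF, 64)

print(smallest_single, largest_single)
print(smallest_double, largest_double)
```

Python prints more digits than the rounded figures in the text because it shows the single-precision values after converting them to its own 64-bit floating-point type.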

Scientific Notation

This notation is a short-hand way of representing floating-point numbers. To convert from scientific notation to standard decimal notation, take the number on the left of the E and multiply it by ten (10) raised to the power of the number on the right of E. If the number to the right of E is negative, the decimal point moves that many places to the left and the decimal value will have many digits to the right of the decimal point. If the number to the right of E is positive, the decimal point moves that many places to the right and the decimal value will have many digits to the left of the decimal point. For a negative exponent, when the number on the left of the E has a single digit before its decimal point, the number of zeros between the decimal point and the first nonzero digit is one less than the size of the exponent. For a positive exponent, the number of zeros to append is the exponent minus the number of digits that follow the decimal point on the left of the E.

For Example,

+1.4E-45 is 0.0000 (for 44 zeros)14

-1.4E-45 is -0.0000 (for 44 zeros)14

+3.4028235E+38 is 34028235 (another 31 zeros).0

-3.4028235E+38 is -34028235 (another 31 zeros).0
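The conversion can also be done mechanically. As a small sketch, Python's standard decimal module can write a number given in E notation out in full; the function name expand is made up for this example.

```python
from decimal import Decimal


def expand(text):
    """Write a number given in E notation out in plain decimal notation."""
    return format(Decimal(text), "f")


small = expand("1.4E-45")
large = expand("3.4028235E+38")

print(small)
print(large)
print(expand("-1.4E-45"))
```

Counting the zeros in the printed values is a quick way to check a conversion done by hand.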

Even Larger Numbers
If a variable needs to contain an integer or floating-point number outside the largest range that is provided by the language, which is a function of the underlying computer processor's architecture, then this will have to be handled specially in software. An example of this is the BigDecimal and BigInteger classes in the java.math package of the Java Core API.

Character

Character variables contain values that represent characters. This class of variable type can either be a single character or a series ("string") of characters. The typical names for these two character types are char and string.

Char
A char is defined to contain a single character. The amount of memory (number of bits or bytes) that a character uses is dependent on the encoding scheme used. An ASCII character is typically stored in 8 bits (1 byte), whereas a Unicode character in the common 16-bit encoding (UTF-16) takes 16 bits (2 bytes) for most characters.

String
A string is defined to contain a sequence of individual characters. The amount of memory used depends on how much memory is required for each character and how many characters are in the string. Strings are not a fixed length like all the other types discussed so far, and therefore the computer cannot process a string directly as a single value. A language must have some system of storing and managing strings.
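The relationship between characters, encodings and memory can be seen by encoding the same string two ways. This is an illustrative Python sketch; the variable names are made up, and UTF-16 is used as an example of a 16-bit Unicode encoding.

```python
text = "Hello"

ascii_bytes = text.encode("ascii")      # 1 byte per character
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character for these letters

print(len(text), len(ascii_bytes), len(utf16_bytes))  # 5 5 10

# A single character is stored as a number defined by the encoding scheme.
code_for_a = ord("A")
print(code_for_a)       # 65
print(chr(code_for_a))  # A
```

The memory used by the string is the number of characters multiplied by the size of each character, plus whatever bookkeeping the language adds to manage the string.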

Array

An array is a computer programming language data structure that contains 1 to N values of a particular variable type. These values are placed in "slots" of the array; each value is in its own slot. The values are typically placed in slots 0 to N-1. An array is a variable, so it has a name, and is of a particular type that matches the variable type of the values to store in the array. A slot in the array is referenced by the integer index number 0 to N-1.

An array has the following components:

Name
The name of the array variable. Examples are: A[], xyz[], cards[]

Type
The type of the array variable matches the type of data that will be stored in the slots of the array. Examples are: int, char, String

Values
The values in an array are the different values stored in the slots of the array. Examples are: {1, 3, 20, 5, 5, 34} for an Int array, {"Sunday", "Monday", "Tuesday", "Wednesday"} for a String array.

Slots
The slots are the different places that the values can be stored. There is only one value at each slot. Different slots could have the same values; they are independent of one another. The slots are referenced by the index number of the slot. Other names for slot are element and position. For example, if we have a String array called DaysOfWeek it could have the values "Sunday", "Monday", "Tuesday", "Wednesday", … in the slots of the array. It would make sense to put the String "Sunday" in the first slot which has an index value of zero (0), "Monday" in the second slot which has an index value of one (1), etc…

Index
The index is the integer value that corresponds to the slot numbers of the array. Typically, the index is an integer from 0 to N-1, where N is the total number of slots. This means that index 0 corresponds to the first slot, index 1 corresponds to the second slot, etc… For example, the value at the third slot of the DaysOfWeek String array is "Tuesday". This is referenced as DaysOfWeek[2]. Remember that in this example, and typically, we start indexing the slots with zero (0). That is why the third slot has an index of 2.
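The DaysOfWeek example can be sketched in Python as follows. A Python list stands in for the array here; unlike the arrays described above it does not enforce a single element type, so treat this only as an illustration of slots and indexes.

```python
days_of_week = ["Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday"]

n = len(days_of_week)        # N, the total number of slots
first = days_of_week[0]      # first slot, index 0
third = days_of_week[2]      # third slot, index 2
last = days_of_week[n - 1]   # last slot, index N-1

# Visit every slot in order, from index 0 to N-1.
for index in range(n):
    print(index, days_of_week[index])
```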

Multi-Dimensional Arrays
An array can also be indexed in multiple dimensions. For example, a two-dimensional array can be thought of as indexed by row and column index numbers. Such an array is conceptually a rectangle with the number of rows and columns specified in its declaration. Typically, in such an array the first row is index zero (0) and the first column is index zero (0). To process all the elements in the array one would start with the first row and access each column in the row, then proceed to the next row and do all those columns. This is repeated until all the rows are processed. Suppose we declare a two-dimensional array to have two (2) rows and four (4) columns (i.e. MyArray[2][4]). To access each element in the array from row one, column one to row two, column four we could use the following scheme.

MyArray[0][0] //row 1, column 1 (remember we index from 0 to N-1)
MyArray[0][1] //row 1, column 2
MyArray[0][2] //row 1, column 3
MyArray[0][3] //row 1, column 4
MyArray[1][0] //row 2, column 1 (notice that the column index number starts over)
MyArray[1][1] //row 2, column 2
MyArray[1][2] //row 2, column 3
MyArray[1][3] //row 2, column 4
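The same row-by-row traversal is usually written as one loop nested inside another. The following Python sketch uses a list of lists as the two-dimensional array; the names my_array and visited are made up for the example.

```python
ROWS, COLUMNS = 2, 4

# Build a 2 x 4 array with every element set to zero.
my_array = [[0] * COLUMNS for _ in range(ROWS)]

visited = []
for row in range(ROWS):              # rows 0 to ROWS-1
    for column in range(COLUMNS):    # the column index starts over for every row
        my_array[row][column] = row * COLUMNS + column
        visited.append((row, column))

print(visited)
print(my_array)
```

The list visited records the index pairs in exactly the order shown in the listing above.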

Arrays with even more dimensions are possible but become conceptually difficult once they go beyond three dimensions. This is because we humans interact primarily in a three-dimensional world and we can comprehend it. Once we try to think in the fourth and greater dimension we get easily overwhelmed because we have no intuitive frame of reference.

Constants

Constants are declared similarly to variables: they have a name and a type. The constant contains a value that corresponds to its type and occupies the amount of memory that the type uses. The difference is that the value of the constant never changes while the program is running. It always has the value that it is coded with in the programming language. Constants referenced by name are better to use in a program than literal values, because if the constant does need to change in the program source code, it can be changed in one place and not everywhere the literal value is coded in the source code.

For example,

constant float pi = 3.14;
circumference = 2*pi*radius;
area = pi*radius^2;
//this code fragment uses the constant pi in two places. If we wanted to change pi to have more precision we just have to change the constant pi in its declaration.

circumference = 2*3.14*radius;
area = 3.14159*radius^2;
//this code fragment codes the pi value as a literal wherever it is needed. If we wanted to change the value of pi we would have to do it in each place. Notice that this example already shows how different values get coded in a program when using literals.
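As a runnable illustration of the named-constant idea, here is a small Python sketch using the usual formulas (circumference is 2 times pi times the radius, area is pi times the radius squared). Python does not enforce constants, so the upper-case name PI is only a convention, and the function names are made up for this example.

```python
PI = 3.14159  # named constant: change the precision here and every use picks it up


def circumference(radius):
    """Circumference of a circle: 2 * pi * r."""
    return 2 * PI * radius


def area(radius):
    """Area of a circle: pi * r squared."""
    return PI * radius ** 2


print(circumference(1.0))
print(area(2.0))
```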

[Exercise: Programming Basics Constructs Variables]

Expressions

Variables, whether explicitly declared or implied, are assigned values based on some sort of expression. This can range from a simple assignment of a value to a variable to an assignment that is the result of a complex expression. These expressions can involve arithmetic, logical and comparative operators. Complex expressions introduce the need for an order of precedence among the operators. Another useful category of expressions is bitwise operations. Variables can also be assigned as a result of a call to another program or subroutine (discussed later). The implied assignment deals with the case when a value is not assigned to a variable name but is obtained from an expression and used like a variable in a program statement. This typically occurs in a conditional statement; this will be discussed later.

Assignment
The most basic expression for a variable is the simple assignment. In this expression a variable is assigned a value. Assignments are made by evaluating the expression on the right side of the assignment operator and storing that result in the memory location referenced by the variable name on the left side of the assignment operator. Typical pseudo code for the assignment operator is ":=". Different languages use different symbols to denote variable assignment but they all have the same idea.

Examples:
Int MyInt;
Int AnotherInt;
MyInt := 3; //a literal assignment
AnotherInt := MyInt; //an assignment to the value of another variable

String MyString;
MyString := "This is a string."; //a literal assignment

Arithmetic
Expressions can involve common arithmetic operations. Most of the operations are available in many computer languages. What is described here is a collection of typical operators and the symbol that usually describes each operation. For addition the + symbol is used. For subtraction the - symbol is used. For multiplication the * symbol is used. For division the / symbol is used. For modulus (the remainder from a division) the % symbol is used. To raise a base number to the power of the exponent the ^ or ** symbol is used (i.e. 2^3=8 or 2**3=8, two raised to the power of 3 is 8). However, many languages do not have this operator, in part because of the possible confusion between * and **, and between ^ to denote power and ^ to denote an eXclusive Or operation (discussed later). The + and - symbols can also be used to make a numeric value positive or negative.

Examples:
Int MyInt;
MyInt := 3 + 2 - 1; //MyInt is 4
MyInt := MyInt * 5; //MyInt is 20
MyInt := MyInt / 2; //MyInt is 10
MyInt := MyInt % 3; //MyInt is 1
MyInt := MyInt + 2**3; //MyInt is 9
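The same sequence of assignments can be run in Python to follow the value of the variable step by step. This is only a sketch: the list named steps is added to record each intermediate value, and Python's // operator is used because its plain / always produces a floating-point result.

```python
steps = []

my_int = 3 + 2 - 1          # 4
steps.append(my_int)
my_int = my_int * 5         # 20
steps.append(my_int)
my_int = my_int // 2        # 10 (integer division)
steps.append(my_int)
my_int = my_int % 3         # 1, the remainder of 10 divided by 3
steps.append(my_int)
my_int = my_int + 2 ** 3    # 9, because 2 ** 3 is 8
steps.append(my_int)

print(steps)  # [4, 20, 10, 1, 9]
```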

Logical
Expressions using logical operators yield a Boolean value. In other words, the result of the expression is either true or false. The three most common Boolean operators are And (&&), Or (||) and Not (!). The And operator is used to evaluate if all the values are true. The Or operator is used to evaluate if any of the values are true. The Not operator is used to change the value to its opposite. These expression operators are most commonly used in conditional statements (discussed later). Another, less common, Boolean operator is eXclusive Or (Xor); it is used to evaluate if only one of the values is true. It is more commonly used in bitwise operations (discussed later). These operators are best described by the following Boolean truth tables. In the tables for the operators that take two values, the headings for the rows and columns are 0 meaning false and 1 meaning true. When the particular Boolean operation is performed on the values in a row and a column, the result is put in the table at the intersection of that row and column. Again, 0 means false and 1 means true.

Boolean And (&&) Truth Table

And	0	1
0	0	0
1	0	1

Only if both values are true is a true value returned. Otherwise false is returned.

Boolean Or (||) Truth Table

Or	0	1
0	0	1
1	1	1

If any one or both values are true a true value is returned. Only if both values are false is a false returned.

Boolean Not (!) Truth Table

Value	Not
0	1
1	0

Not is a unary operator: it is applied to only one value, so its table has a single input column. It simply changes a true value to false and a false value to true.

Boolean eXclusive Or (^) Truth Table

Xor	0	1
0	0	1
1	1	0

If both values are the same then return false otherwise return true.
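The truth tables can be generated rather than memorized. The Python sketch below builds each table as a dictionary that maps the input values to the result; the helper name truth_table is made up for this example, and Python spells the operators and, or and not instead of &&, || and !.

```python
from itertools import product


def truth_table(operation):
    """Map every (a, b) pair of 0/1 inputs to the 0/1 result of a two-value operator."""
    return {(a, b): int(operation(bool(a), bool(b)))
            for a, b in product((0, 1), repeat=2)}


and_table = truth_table(lambda a, b: a and b)
or_table = truth_table(lambda a, b: a or b)
xor_table = truth_table(lambda a, b: a != b)   # true when exactly one value is true
not_table = {a: int(not bool(a)) for a in (0, 1)}

for name, table in [("And", and_table), ("Or", or_table), ("Xor", xor_table)]:
    print(name, table)
print("Not", not_table)
```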

Comparison
These operators test for equality or a relation between operands and yield a Boolean result. They are typically used to test conditions in conditional statements and branching statements, both of which are covered below.

The equality operators are:
Equal (=) which tests if two operands are equal to one another.
Not Equal (<> or !=) which tests if two operands are NOT equal to one another.

The relational operators:
Less Than (<) which tests if the left-side operand is less than the right-side operand.
Greater Than (>) which tests if the left-side operand is greater than the right-side operand.

These can also be combined with (=) to test for:
Less Than or Equal (<=)
Greater Than or Equal (>=)

The relational operators can also be used with the Not operator but this produces no new results.

Not Less Than is the same as Greater Than or Equal: !(a<b) is the same as a>=b
Not Greater Than is the same as Less Than or Equal: !(a>b) is the same as a<=b
Not Less Than or Equal is the same as Greater Than: !(a<=b) is the same as a>b
Not Greater Than or Equal is the same as Less Than: !(a>=b) is the same as a<b

Precedence
When complex expressions are formed, the need arises to explicitly specify the order in which the expression is evaluated. This order is called precedence, and it can be set explicitly by the use of parentheses ( ). Without the parentheses the expression is evaluated based on the operators' default order of evaluation. If several operators have the same precedence then they are evaluated from left to right. This default order may not yield the desired result for an expression.

For example:
If the default precedence is multiplication before addition, then:
12 + 4 * 5 = 32, because 4 * 5 = 20 and 12 + 20 = 32.
(12 + 4) * 5 = 80, because (12 + 4) = 16 and 16 * 5 = 80.

Bitwise operations
There are times when you may want to manipulate the value of a variable at the binary level. At the binary level you are working with the bits that represent the value in the variable. Several of the operators are similar to the logical expression operators and are based on the same Boolean table results. These are And, Or, Not and Xor. Two other bitwise operators are typically provided that shift the value in the variable to the right or left a specified number of bit positions. The bits that shift out of the variable's size unit are lost and the bit value shifted in is typically a zero.

And (&) – the corresponding bits (same bit position) in the two operands are Anded together to produce a new bit value based on the Boolean And table results.

For Example:

A	10011011
B	10101101
Result	10001001

Or (|) – the corresponding bits (same bit position) in the two operands are Ored together to produce a new bit value based on the Boolean Or table results.

For Example:

A	10011011
B	10101101
Result	10111111

Not (~) – each bit in an operand is changed to its complement to produce a new bit value based on the Boolean Not table results.

For Example:

A	10011011
Result	01100100

Xor (^) – the corresponding bits (same bit position) in the two operands are Xored together to produce a new bit value based on the Boolean Xor table results. This is useful because the transformation can be reversed: when the result is Xored with one of the original operands, the other original operand is produced.

For Example:

A	10011011
B	10101101
Result	00110110

Result	00110110	Result	00110110
B	10101101	A	10011011
New Result (A)	10011011	New Result (B)	10101101

Applications of this idea are RAID-5 and the graphical technique that shows a selected area that can be cut or copied, where the original graphic image is restored when the selection is complete. In other words, the drawing of lines to indicate the area being selected does not destroy the original image. This is easily implemented by using the Xor operation on the bits that make up the graphical image.
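The bitwise examples above can be reproduced with a few lines of Python. This is a sketch under one assumption: Python integers are not limited to eight bits, so the helper bits (a name made up for this example) masks every result to one byte before printing it.

```python
A = 0b10011011
B = 0b10101101
MASK = 0xFF  # keep only the low 8 bits, the size unit used in the examples


def bits(value):
    """Show the low 8 bits of a value as a string of 0s and 1s."""
    return format(value & MASK, "08b")


print(bits(A & B))   # 10001001
print(bits(A | B))   # 10111111
print(bits(~A))      # 01100100

result = A ^ B
print(bits(result))  # 00110110

# Xor with one original operand gives back the other one.
recovered_a = result ^ B
recovered_b = result ^ A
print(bits(recovered_a), bits(recovered_b))  # 10011011 10101101
```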

Left Shift (<<) – the bits in the size unit that contains the variable's value are shifted to the left for the specified number of positions and the bits on the right are filled with zeros.

For Example:
10011101 << 3 = 11101000
At the first shift we have (1) 00111010, where (1) indicates the bit shifted out of the byte on the left. This bit is lost.
At the second shift we have (0) 01110100
At the third shift we have (0) 11101000

The Left Shift operator is a quick way to multiply by two to the n, where n is the number of bits to shift, provided that no 1 bits are shifted out of the size unit.

Right Shift (>>) – the bits in the size unit that contains the variable's value are shifted to the right for the specified number of positions and the bits on the left are filled with whatever the original high bit value was (sign extension).

For Example:
10011101 >> 3 = 11110011
At the first shift we have 11001110 (1), where (1) indicates the bit shifted out of the byte on the right. This bit is lost.
At the second shift we have 11100111 (0)
At the third shift we have 11110011 (1)

The Right Shift operator is a quick way to divide a positive number by two to the n, where n is the number of bits to shift.

Also, a right shift operation that shifts in zeros instead of the high order bit is a common operator provided in languages.
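The shift examples can be checked in Python as well. Because Python integers have no fixed size, this sketch masks results to eight bits and converts the byte to a signed value before the sign-extending shift; the helper names bits and to_signed are made up for this example.

```python
MASK = 0xFF  # one 8-bit byte


def bits(value):
    """Show the low 8 bits of a value as a string of 0s and 1s."""
    return format(value & MASK, "08b")


def to_signed(byte):
    """Interpret an 8-bit pattern as a signed (two's complement) number."""
    return byte - 256 if byte & 0x80 else byte


value = 0b10011101

left = (value << 3) & MASK                     # bits shifted out on the left are lost
sign_extended = (to_signed(value) >> 3) & MASK  # high bit is copied in on the left
zero_filled = value >> 3                        # zeros are shifted in on the left

print(bits(left))           # 11101000
print(bits(sign_extended))  # 11110011
print(bits(zero_filled))    # 00010011

# Shifting as quick multiplication and division by two to the n.
print(5 << 3, 40 >> 3)      # 40 5
```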

[Exercise: Programming Basics Constructs Expressions]

Conditional/Branching

These types of statements are used to make a decision within a program. The conditional statements evaluate an expression to determine if it is true or false. Based on the result, the next instruction to execute in the program is determined. The ability to evaluate conditions and branch to a program statement other than the next sequential statement is a fundamental element of computer programming. This is directly related to the sequential instruction execution computer concept.

There are two basic forms of conditional statements. The first is the If..Then..Else.. statement and the other is the Select..Case.. statement.

If..Then..Else..
This conditional statement is used to make a decision between two different outcomes. If the result of the expression is true then the statements after the "Then" keyword are executed. Otherwise the statements after the "Else" keyword are executed. Once the selected statements have been executed, execution continues with the statement that follows the If..Then..Else statement. If the "Else" clause is absent and the expression evaluates to false, the statements after "Then" are skipped and execution continues with the statement that follows the If..Then statement.

For Example:
Wake up and get ready for work
Go outside
If it looks like it will rain today
Then
…..Take an umbrella with you
Else
…..Leave the umbrella at home
Go to work

The Else clause could have been left out. In that case the normal action would be to leave the umbrella at home without explicitly stating so.

Select..Case..
This conditional statement is used to evaluate an expression and select between multiple choices for the next statements to execute.

For Example:
Select type of clothes to wear today based on the weather
Case Sunny and Hot
…..Wear shorts and a T-shirt
Case Raining
…..Wear a raincoat over normal clothes
Case Cool
…..Wear a light jacket
Case Cold and snowing
…..Wear a heavy winter jacket and overcoat

The Select..Case.. conditional form is a short-hand way of combining multiple If..Then..Else conditional statements.

For Example:
If Sunny and Hot
Then
…..Wear shorts and a T-shirt
Else
…..If Raining
…..Then
…..…..Wear a raincoat over normal clothes
…..Else
…..…..If Cool
…..…..Then
…..…..…..Wear a light jacket
…..…..Else
…..…..…..If Cold and snowing
…..…..…..Then
…..…..…..…..Wear a heavy winter jacket and overcoat
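In a real language the nested form is often flattened into a chain of tests. The following Python sketch expresses the clothing example with if, elif and else; the function name clothes_for, the lower-case weather strings, and the final default case are assumptions added for this example.

```python
def clothes_for(weather):
    """Choose clothes based on a short description of the weather."""
    if weather == "sunny and hot":
        return "shorts and a T-shirt"
    elif weather == "raining":
        return "a raincoat over normal clothes"
    elif weather == "cool":
        return "a light jacket"
    elif weather == "cold and snowing":
        return "a heavy winter jacket and overcoat"
    else:
        # Assumed default; the example above does not define this case.
        return "normal clothes"


for weather in ["sunny and hot", "raining", "cool", "cold and snowing", "foggy"]:
    print(weather, "->", clothes_for(weather))
```

Only the first test that is true selects the outcome, which matches the behavior of both the Select..Case.. form and the nested If..Then..Else form.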

Branching/Looping

Another fundamental property of programming the sequential instruction execution computer is the ability to change the program counter to a location other than the next sequential location in memory. This property is known as branching. Conditional statements inherently include branching. When this is combined with the ability to branch to a specified location, loops (repeatedly executing the same statements) can be constructed. There are three basic forms of loops and one special form of branching that does not involve a condition to be evaluated. This special form of branching is the unconditional branch. It is exemplified by the Goto statement.

Goto

This statement is used to instruct the computer that the next statement in the program to execute is found at the location that the Goto statement specifies. There are times when this statement is the best or only efficient means to accomplish the required program logic. However, it tends to lend itself to programs that are hard to read and understand. The use of the Goto statement is discouraged in all but the most necessary cases. Other branching and looping constructs and program structuring techniques can usually be used in place of the Goto statement.

The three basic forms of the loop are each used for a particular type of looping logic.

For...Next
This looping form is to be used when the number of times that the loop should be performed can be pre-determined.

Examples:
For I=1 to 10 //explicitly stated the number of times to loop
…..do something
Next I

For I=1 to Length("Hello World") //calculated beforehand the number of times to loop
…..do something
Next I

While...Loop
This looping form is to be used when a test should be done before the loop is executed, even the first time. The loop may execute zero (0) to N times, where N may or may not be pre-determined.

Example:
While (keyboard input != "Enter")
…..do something
Loop

Loop...Until
This looping form is to be used when a test should be done after the loop has executed at least one time. The loop will execute one (1) to N times, where N may or may not be pre-determined.

Example:
Loop
…..get password input
Until (password is correct)
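The three loop forms can be sketched in Python as follows. Python has no direct Loop...Until statement, so that form is emulated here with an endless loop and a test at the bottom; the lists keys and attempts are made-up stand-ins for keyboard and password input.

```python
# For...Next: the number of repetitions is known beforehand.
total = 0
for i in range(1, 11):          # i takes the values 1 to 10
    total += i

# While...Loop: the test comes first, so the body may run zero times.
keys = ["a", "b", "Enter", "c"]
position = 0
typed = []
while keys[position] != "Enter":
    typed.append(keys[position])
    position += 1

# Loop...Until: the body runs at least once, then the test decides.
attempts = iter(["guess", "letmein", "secret"])
tries = 0
while True:
    password = next(attempts)   # get password input
    tries += 1
    if password == "secret":    # until (password is correct)
        break

print(total, typed, tries)      # 55 ['a', 'b'] 3
```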

Input/Output

For a program to communicate with the outside world and thus do something interesting it must be able to perform input and output operations. The basic operations for getting input from the outside world into a computer program and for outputting results are covered in this section.

Read
To get input from an external source into a program the read operation is used.

Examples:
Read input from a keyboard
Read input from a data file

Write
To put output from a program to an external device the write operation is used.

Examples:
Write output to a computer screen
Write output to a printer
Write output to a data file

Open
To establish a communication link between the program and an external device the open operation is used.

Close
To end a communication link between the program and an external device the close operation is used.
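The four operations can be seen together in a small file example. This Python sketch writes two lines to a data file and reads them back; the file name example.txt and the use of a temporary directory are assumptions made so that the example can run anywhere.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

out_file = open(path, "w")         # Open: establish the link for writing
out_file.write("first line\n")     # Write
out_file.write("second line\n")
out_file.close()                   # Close: end the link

in_file = open(path, "r")          # Open again, this time for reading
lines = in_file.read().splitlines()  # Read
in_file.close()                    # Close

print(lines)  # ['first line', 'second line']
```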

Subroutines

To assist in reusing common segments of programs, subroutines have been developed. These routines are "sub-programs" that do a particular task and can be reused many times at different points within a main program or even by other programs. There are two basic types of subroutines, the procedure and the function. Each allows parameters to be passed into the subroutine so that the "sub-program" uses the values of those parameters to perform the task it is defined to do. This mechanism allows the same "sub-program" to be used with different values in the parameters. There are multiple ways in which to pass the parameters to a subroutine, which are briefly discussed here. The function is differentiated from the procedure in that the function returns a value that it produces. Subroutines also cause the concept of "scope" to be introduced into programming. Scope has to do with the places where a particular variable is valid. This is an advanced topic that will be covered at a later time.

Procedure
A procedure is a subroutine that optionally takes input parameters and performs a task. It does not explicitly return a value to the program that called the procedure.

Function
A function is a subroutine that optionally takes input parameters, performs a task and returns a result. Though the input parameters are optional, most functions have input parameters in order to be useful. The value that is returned is defined to be of a particular type.

Parameter Passing
Parameters are passed into subroutines by two basic methods. The parameters are either passed into the subroutine by value or by reference. Each of these types is used for different reasons, which will be covered at a later time.

By value – a parameter that is passed by value means that a copy of the value of the variable from the program that calls the subroutine is put in the parameter variable of the subroutine. The subroutine uses that copy of the variable’s value within the subroutine but the value of the variable in the calling program is not affected.

By reference – a parameter that is passed by reference means that the location of the variable from the program that calls the subroutine is referred to by the parameter variable of the subroutine. The subroutine’s parameter variable and the program’s parameter variable are both located in the same place in memory. This implies that as the subroutine changes the value of the variable within the subroutine, the value of the variable in the calling program is also changed.

Return Values
A function returns a result of a particular type to the program that called the function. In the program that called the function a variable, either explicitly or implicitly will be assigned the returned value. The variable must be of the same type that the function returns.
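The difference between a procedure and a function, and the visible effect of the two parameter passing methods, can be sketched in Python. One caveat: Python passes references to objects and has no separate by-value and by-reference declarations, so this example only imitates the two behaviors. Reassigning a number inside a subroutine leaves the caller's variable alone (as with by value), while changing the contents of a list is seen by the caller (as with by reference). All names here are made up for the example.

```python
def print_greeting(name):
    """Procedure: performs a task and does not return a value."""
    print("Hello,", name)


def square(number):
    """Function: returns a result to the caller."""
    return number * number


def add_one_to_copy(number):
    """Behaves like by value: only the subroutine's own variable changes."""
    number = number + 1
    return number


def add_one_in_place(numbers):
    """Behaves like by reference: the caller's list itself is changed."""
    numbers[0] = numbers[0] + 1


print_greeting("world")
result = square(4)          # the returned value is assigned to a variable

count = 5
add_one_to_copy(count)      # count is still 5 afterwards

shared = [5]
add_one_in_place(shared)    # shared is now [6]

print(result, count, shared)  # 16 5 [6]
```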

[Exercise: Programming Basics Constructs Conditionals Branching IO Subroutines]

Sequential Execution Computer

The Von Neumann machine, which is the ubiquitous form of the modern computer, is based on the principle that instructions are executed one after the other. The next instruction to execute is the instruction at the next sequential memory address; this address is held in the program counter. The program counter can be changed based on a conditional evaluation and a branch instruction that provides the new memory address.
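The role of the program counter can be made concrete with a toy simulation. The Python sketch below runs a list of made-up instructions (SET, ADD, PRINT and JUMP_IF_LESS are invented for this example and do not belong to any real processor). The counter normally advances to the next instruction, and a branch replaces it with a new address.

```python
def run(program):
    """Execute a tiny made-up instruction list and return the values it printed."""
    counter = 0       # program counter: index of the next instruction
    register = 0      # a single storage location for the running value
    output = []
    while counter < len(program):
        operation, argument = program[counter]
        counter += 1  # default: continue with the next sequential instruction
        if operation == "SET":
            register = argument
        elif operation == "ADD":
            register += argument
        elif operation == "PRINT":
            output.append(register)
        elif operation == "JUMP_IF_LESS":
            limit, target = argument
            if register < limit:
                counter = target   # branch: overwrite the program counter
    return output


program = [
    ("SET", 0),                 # address 0
    ("ADD", 1),                 # address 1
    ("PRINT", None),            # address 2
    ("JUMP_IF_LESS", (3, 1)),   # address 3: go back to address 1 while the value is below 3
]

printed = run(program)
print(printed)  # [1, 2, 3]
```

The conditional jump back to an earlier address is all that is needed to build a loop, which is how the looping constructs described earlier are carried out by the machine.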
